AIbase

# Multimodal Instruction

Phi 4 Multimodal Instruct
License: MIT
Phi-4-multimodal-instruct is a lightweight open-source multimodal foundation model that accepts text, image, and audio inputs and generates text outputs, with a context length of 128K tokens.
Tags: Multimodal Fusion, Transformers, Supports Multiple Languages
mjtechguy
18
0
Heron Chat Git Llama 2 7b V0
Heron GIT Llama 2 7B is a vision-language model capable of conversing about input images.
Tags: Image-to-Text, Transformers, English
turing-motors
20
0
© 2025 AIbase